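The COVID-19 pandemic has sharply accelerated the spread of e-shopping. This rapid growth will undoubtedly have a significant impact on travel demand, so the ability of transportation modelers to model e-shopping demand is becoming increasingly important. This study develops models to predict households' weekly home delivery frequencies. We use both classical econometric and machine learning techniques to obtain the best model. We find that socioeconomic factors, such as having an online grocery membership, the average age of household members, the percentage of male household members, and the number of workers in the household, together with various land use factors, influence home delivery demand. This study also compares the interpretations and performance of the machine learning models and the classical econometric model. The variable effects identified by the machine learning and econometric models agree. However, at similar recall accuracy, the ordered probit model, a classical econometric model, accurately predicts the aggregate distribution of household delivery demand, whereas both machine learning models fail to match the observed distribution.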
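As a rough illustration of how an ordered probit model turns household attributes into a distribution over delivery frequencies, the following minimal sketch computes category probabilities from a latent index and a set of cutpoints, then averages them into aggregate shares. The feature layout, coefficients, and cutpoints are all hypothetical values chosen for illustration; they are not the study's estimates.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(x, beta, cutpoints):
    """Return P(y = k) for k = 0..len(cutpoints).

    The latent index is the dot product x . beta; cutpoints must be increasing.
    """
    index = sum(xi * bi for xi, bi in zip(x, beta))
    # Cumulative probabilities P(y <= k), padded with 0 and 1 at the ends.
    cum = [0.0] + [normal_cdf(c - index) for c in cutpoints] + [1.0]
    return [cum[k + 1] - cum[k] for k in range(len(cum) - 1)]

def aggregate_shares(households, beta, cutpoints):
    """Average per-household probabilities into a predicted aggregate distribution."""
    per_household = [ordered_probit_probs(x, beta, cutpoints) for x in households]
    n = len(per_household)
    return [sum(p[k] for p in per_household) / n for k in range(len(cutpoints) + 1)]

# Hypothetical features: [has online grocery membership, number of workers]
households = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]
beta = [0.8, 0.3]            # illustrative coefficients, not estimates
cutpoints = [0.5, 1.5, 2.5]  # thresholds between 0, 1, 2, and 3+ deliveries per week

probs = ordered_probit_probs(households[0], beta, cutpoints)
shares = aggregate_shares(households, beta, cutpoints)
print([round(p, 3) for p in probs])
print([round(s, 3) for s in shares])
```

Comparing `shares` against the observed frequency shares is one simple way to check how well a model reproduces the aggregate distribution, which is the comparison the abstract refers to.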
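Graph neural networks (GNNs) have been widely used for representation learning on graph data. However, there is limited understanding of how much performance GNNs actually gain from graph data. This paper introduces a context-surrounding GNN framework and proposes two smoothness metrics to measure the quantity and quality of the information obtained from graph data. A new GNN model, called CS-GNN, is then designed to improve the use of graph information based on the smoothness values of a graph. CS-GNN is shown to achieve better performance than existing methods on different types of real graphs.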
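Despite the success of invariant risk minimization (IRM) in tackling the out-of-distribution generalization problem, IRM can compromise optimality when applied in practice. Practical variants of IRM, such as IRMv1, have been shown to have significant gaps with IRM and can therefore fail to capture invariance even in simple problems. Moreover, the optimization procedure in IRMv1 involves two intrinsically conflicting objectives and often requires careful tuning of the objective weights. To remedy these issues, we reformulate IRM as a multi-objective optimization problem and propose a new optimization scheme for IRM, called PAreto Invariant Risk Minimization (PAIR). PAIR can adaptively adjust the optimization direction under objective conflicts. Furthermore, we show that PAIR can empower practical IRM variants to overcome the barriers with the original IRM when provided with proper guidance. We conduct experiments on ColoredMNIST to confirm our theory and the effectiveness of PAIR.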
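To make the two conflicting objectives concrete, here is a minimal sketch of the IRMv1 objective for scalar predictions under squared loss: the empirical risk summed over environments, plus a penalty equal to the squared gradient of each environment's risk with respect to a dummy scale w, evaluated at w = 1. This sketches IRMv1 only; it does not implement PAIR, and the function names and toy data are assumptions made for illustration.

```python
def env_risk_and_penalty(preds, targets):
    """Squared-loss risk and IRMv1 penalty for one environment.

    With R(w) = mean((w * f - y)^2), the gradient at w = 1 is
    mean(2 * (f - y) * f); the penalty is that gradient squared.
    """
    n = len(preds)
    risk = sum((f - y) ** 2 for f, y in zip(preds, targets)) / n
    grad = sum(2.0 * (f - y) * f for f, y in zip(preds, targets)) / n
    return risk, grad ** 2

def irmv1_objective(envs, penalty_weight):
    """Return (weighted objective, total risk, total penalty) over environments."""
    total_risk = 0.0
    total_penalty = 0.0
    for preds, targets in envs:
        risk, penalty = env_risk_and_penalty(preds, targets)
        total_risk += risk
        total_penalty += penalty
    return total_risk + penalty_weight * total_penalty, total_risk, total_penalty

# Two toy environments of (predictions, targets); values are illustrative only.
envs = [([2.0], [1.0]), ([1.0, 2.0], [1.0, 2.0])]
objective, risk, penalty = irmv1_objective(envs, penalty_weight=10.0)
print(objective, risk, penalty)
```

The single scalar `penalty_weight` is the quantity that typically needs careful tuning; a multi-objective formulation would instead treat `risk` and `penalty` as separate objectives rather than fixing their ratio in advance.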
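Despite recent success in using the invariance principle for out-of-distribution (OOD) generalization on Euclidean data (e.g., images), studies on graph data remain limited. Unlike images, the complex nature of graphs poses unique challenges to adopting the invariance principle. In particular, distribution shifts on graphs can appear in a variety of forms, such as attributes and structures, which makes the invariance difficult to identify. Moreover, the domain or environment partitions that OOD methods on Euclidean data often require can be highly expensive to obtain for graphs. To bridge this gap, we propose a new framework that captures the invariance of graphs for guaranteed OOD generalization under various distribution shifts. Specifically, we characterize potential distribution shifts on graphs with causal models and conclude that OOD generalization on graphs is achievable when models focus only on the subgraphs containing the most information about the causes of labels. Accordingly, we propose an information-theoretic objective to extract the desired subgraphs that maximally preserve the invariant class information. Learning with these subgraphs is immune to distribution shifts. Extensive experiments on both synthetic and real-world datasets, including a challenging setting in AI-aided drug discovery, validate the superior OOD generalization ability of our method.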
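While English virtual assistants have achieved exciting performance with an enormous amount of training resources, the needs of non-English speakers have not been met. As of December 2021, Alexa, one of the most popular smart speakers in the world, supports 9 different languages [1], whereas there are thousands of languages in the world, 91 of which are spoken by more than 10 million people according to statistics published in 2019 [2]. However, training a virtual assistant in a language other than English is often more difficult, especially for low-resource languages. The lack of high-quality training data restricts model performance and results in poor user satisfaction. Therefore, we devise an efficient and effective training solution for multilingual task-oriented dialogue systems, using the same dataset generation pipeline and end-to-end dialogue system architecture as BiToD [5], which adopted some key design choices for a minimalistic natural language design in which formal dialogue states are used in place of natural language inputs. This reduces the room for error introduced by weaker natural language models and ensures that the model can correctly extract the essential slot values needed to perform dialogue state tracking (DST). Our goal is to reduce the amount of natural language encoded at each turn, and the key parameter we investigate is the number of turns (H) fed to the model as history. We first explore the turning point at which increasing H begins to yield diminishing returns in overall performance. Then we examine whether the examples that a model with a small H gets wrong can be categorized in a way that allows the model to do few-shot fine-tuning on them. Finally, we explore the limitations of this approach and whether there is a certain type of example that it cannot resolve.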
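The role of the history parameter H can be sketched as follows: the model input is assembled from the formal dialogue state plus only the last H turns of the conversation. The input layout, the `build_model_input` helper, and the example dialogue are hypothetical; the actual serialization used with BiToD is not specified here.

```python
def build_model_input(dialogue_state, turns, h):
    """Concatenate a formal dialogue state with the last h turns of history.

    dialogue_state: dict of slot name -> value (formal state, not free text)
    turns: list of (speaker, utterance) tuples in chronological order
    h: number of most recent turns to keep as history
    """
    # Guard h <= 0 explicitly, because turns[-0:] would return the full list.
    history = turns[-h:] if h > 0 else []
    state_text = " ; ".join(f"{slot} = {value}" for slot, value in sorted(dialogue_state.items()))
    history_text = " ".join(f"<{speaker}> {utterance}" for speaker, utterance in history)
    return f"<state> {state_text} <history> {history_text}".strip()

# Hypothetical example dialogue.
turns = [
    ("user", "I need a hotel."),
    ("system", "Which city?"),
    ("user", "Hong Kong, four stars."),
]
state = {"domain": "hotel", "city": "Hong Kong"}

short_input = build_model_input(state, turns, h=1)
full_input = build_model_input(state, turns, h=3)
print(short_input)
print(full_input)
```

With a small `h`, information from earlier turns must already be captured in the formal state, which is why the slot values carried there matter for dialogue state tracking.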
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even if the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Knowledge graphs (KGs) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKGs) are a special type of KG in which entities and relations are composed of free-form text. However, previous work on KG completion and CKG completion suffers from long-tail relations and newly added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of both graph representation learning and few-shot learning, has been proposed to tackle the problem of limited annotated data. In this paper, we comprehensively survey previous attempts at such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. We then systematically categorize and summarize existing work in terms of the type of KG and the methods used. Finally, we present applications of FKGC models to prediction tasks in different areas and share our thoughts on future research directions for FKGC.
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. Therefore, it is difficult to deploy them onto edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution for compressing GNNs, where a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness consideration. As a consequence, the student model usually inherits and even exaggerates the bias of the teacher GNN. To handle this problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
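For readers unfamiliar with how a student "mimics" a teacher, the following minimal sketch shows the generic soft-target distillation loss: the KL divergence between temperature-softened teacher and student output distributions, scaled by the squared temperature. This is the standard KD formulation only; it contains no fairness term and is not the RELIANT framework. The logits and temperature are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2

# Illustrative per-node class logits from a teacher GNN and a smaller student.
teacher_logits = [3.0, 1.0, 0.2]
student_logits = [2.0, 1.5, 0.1]

loss = distillation_loss(student_logits, teacher_logits)
print(round(loss, 4))
```

Because the student is trained to match the teacher's full output distribution, any bias encoded in those outputs can carry over to the student, which is the concern the abstract raises.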
This paper focuses on designing efficient models with few parameters and low FLOPs for dense prediction. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependencies and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1, surpassing SoTA CNN-/Transformer-based models while balancing model accuracy and efficiency well.